Handling high-dimensional data in air pollution forecasting tasks
Authors
Abstract
This paper proposes methods for handling the high-dimensional weather forecast data used to predict concentrations of PM10, PM2.5, SO2, NO, CO and O3. Predicting pollution normally requires historical data samples for a large number of points in time: weather forecast data, actual weather data and pollution data. It also typically involves numerous features related to atmospheric conditions. Consequently, analysing such datasets to generate accurate forecasts becomes a very cumbersome task. The paper examines a variety of unsupervised dimensionality reduction methods aimed at obtaining a compact yet informative set of features. As an alternative, an approach that uses fractional distances for data analysis tasks is also considered. Both strategies were evaluated on real-world data obtained from the Institute of Meteorology and Water Management in Katowice (Poland), with the extended Air Pollution Forecast Model (e-APFM) used as the underlying prediction tool. Employing fractional distance as a dissimilarity measure was found to give the best forecasting accuracy. Satisfactory results can also be obtained with Isomap, Landmark Isomap and Factor Analysis as dimensionality reduction techniques. These methods can also be used to formulate a universal mapping that is ready to use for data gathered in different geographical areas.
Diana Domańska, Szymon Lukasik. Preprint submitted to Ecological Informatics, November 16, 2016.
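The fractional distance mentioned in the abstract is usually understood as a Minkowski-type dissimilarity with an exponent p between 0 and 1. The sketch below illustrates that general idea only; it is not the authors' implementation or e-APFM. The function names `fractional_distance` and `nearest_index`, the default p = 0.5, and the sample vectors are all assumptions made for illustration.

```python
from typing import Sequence


def fractional_distance(x: Sequence[float], y: Sequence[float], p: float = 0.5) -> float:
    """Minkowski-type dissimilarity with a fractional exponent 0 < p < 1.

    Hypothetical helper: p = 0.5 is an illustrative default, not a value
    taken from the paper.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Sum the p-th powers of the absolute coordinate differences,
    # then raise the sum to the power 1/p.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def nearest_index(query: Sequence[float],
                  history: Sequence[Sequence[float]],
                  p: float = 0.5) -> int:
    """Index of the historical sample least dissimilar to the query."""
    return min(range(len(history)),
               key=lambda i: fractional_distance(query, history[i], p))


if __name__ == "__main__":
    # Made-up feature vectors standing in for weather forecast samples.
    history = [[3.0, 3.0], [0.5, 0.0], [10.0, 0.0]]
    print(nearest_index([0.0, 0.0], history))
```

For p < 1 this measure does not satisfy the triangle inequality, so it is a dissimilarity rather than a true metric, which is consistent with how the abstract describes it.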
Similar resources
Forecasting Air Pollution Concentrations in Iran, Using a Hybrid Model
The present study aims at developing a forecasting model to predict the next year’s air pollution concentrations in the atmosphere of Iran. In this regard, it proposes the use of ARIMA, SVR, and TSVR, as well as hybrid ARIMA-SVR and ARIMA-TSVR models, which combined the autoregressive part of the autoregressive integrated moving average (ARIMA) model with the support vector regression technique...
The fuzzy logic in air pollution forecasting model
In the paper a model to predict the concentrations of particulate matter PM10, PM2.5, SO2, NO, CO and O3 for a chosen number of hours forward is proposed. The method requires historical data for a large number of points in time, particularly weather forecast data, actual weather data and pollution data. The idea is that by matching forecast data with similar forecast data in the historical data...
Forecasting Ozone Density in Tehran Air Using a Smart Data-Driven Approach
Introduction: As a metropolitan area in Iran, Tehran is exposed to damage from air pollution due to its large population and pollutants from various sources. Accordingly, research on damage induced by air pollution in this city seems necessary. The main purpose of this study was to forecast ozone in the city of Tehran. Considering the hazards of ozone (O3) gas on human health and the environmen...
Capabilities of data assimilation in correcting sea surface temperature in the Persian Gulf
Predicting the quality of water and air is a particular challenge for forecasting systems that support them. In order to represent the small-scale phenomena, a high-resolution model needs accurate capture of air and sea circulations, significant for forecasting environmental pollution. Data assimilation is one of the state of the art methods to be used for this purpose. Due to the importance of...
Journal: Ecological Informatics
Volume: 34
Publication date: 2016